Abstract: This paper presents a method for automated healing
as part of off-line automated troubleshooting. The method
combines statistical learning with constraint optimization. The 
automated healing aims at locally optimizing radio resource 
management (RRM) or system parameters of cells with poor 
performance in an iterative manner. The statistical learning pro- 
cesses the data using Logistic Regression (LR) to extract closed 
form (functional) relations between Key Performance Indicators 
(KPIs) and Radio Resource Management (RRM) parameters. 
These functional relations are then processed by an optimization 
engine which proposes new parameter values. The advantage 
of the proposed formulation is the small number of iterations 
required by the automated healing method to converge, making 
it suitable for off-line implementation. The proposed method is 
applied to heal an Inter-Cell Interference Coordination (ICIC) 
process in a 3G Long Term Evolution (LTE) network which
is based on a soft-frequency reuse scheme. Numerical simulations
illustrate the benefits of the proposed approach. 

Keywords: Statistical learning, Logistic Regression, automated
healing, troubleshooting, Inter-Cell Interference Coordination,
LTE.

I. Introduction 

Efficient management of future Beyond 3G and 4G networks
is a major challenge for network operators [1]. The
wireless ecosystem is becoming more and more heterogeneous 
with co-existing/co-operating technologies and deployment 
scenarios (i.e. macro, micro, pico and femto cell structures). 
Fault management or troubleshooting is an important building 
block of network operation. Troubleshooting comprises three 
functionalities: fault detection (i.e. detecting failures or poor 
performance as soon as they occur); fault diagnosis (i.e. 
determining the cause of failure or of poor performance), and 
fault recovery or healing (i.e. repairing the problem) [2].
The importance of efficient fault management has motivated 
the development of automated methods and tools for diag- 
nosis and healing. In this work, the main focus is given to 
automated healing. It is supposed that a given cell with poor 
performance has been identified (fault detection) and the cause 
of the degraded performance has been diagnosed as a bad 
setting of a specific Radio Resource Management (RRM) or 
a system parameter (fault diagnosis) [2], [3]. The automated
healing process aims at locally optimizing the value of this
parameter, taking into account the Key Performance Indicators
(KPIs) of the faulty cell and those of its neighbours. In other
words, the RRM parameter that is found as the fault cause by
the diagnosis process is optimized by the automated healing
module.

Local optimization, or "steered optimization", has been
studied in the literature using combinatorial optimization
in conjunction with the interference matrix to tackle local
problems detected in the network [4]. This approach uses a
network simulator and can be implemented as an advanced 
functionality of a cell planning tool. The focus of this paper 
is to develop an automated healing method based on measure- 
ments. More precisely, our aim is to conceive an automated 
healing method that uses statistical learning of measured 
data and constraint optimization. The method is denoted as 
Statistical Learning Automated Healing (SLAH). The SLAH
module can be located at the management plane, e.g. in the 
Operation and Maintenance Centre (OMC) where abundant 
data is available. The method is iterative with a time resolution 
of a day, and should therefore converge in a few iterations. 
To meet this requirement, a statistical learning approach
using Logistic Regression (LR) is proposed that extracts the
functional relations between KPIs and RRM parameters; these
relations constitute the statistical model. It is noted that the
data is noisy due to the random character of the traffic and of
the radio channel, but also due to measurement imprecision.
After each iteration, the statistical model is updated using the 
additional data and its precision is improved. The model is 
then introduced into the optimization engine and is processed 
directly to derive the next RRM parameter. This approach has
the merit of converging rapidly. The performance of the SLAH
is evaluated on an interference mitigation use case, namely an
Inter-Cell Interference Coordination (ICIC) problem in an LTE
network. The choice of this use case is motivated by the
importance of interference mitigation in OFDMA (LTE/WiMAX)
networks, since it improves the system performance and, in
particular, helps to reach the strict requirements for cell edge
bit rates (defined in B3G and 4G network standards, such as
[5]). In this context, ICIC is one of the efficient approaches to
mitigate interference. Interference mitigation techniques such
as ICIC can considerably improve the Signal to Interference plus
Noise Ratio (SINR) and hence bit rates, particularly at cell edge.
As a result, better network performance and user Quality of 
Service (QoS) are achieved, including reduced File Transfer 
Time (FTT), Block Call Rate (BCR) and Drop Call Rate 
(DCR). Different interference mitigation methods have been
proposed for OFDMA systems, such as fractional reuse and
soft reuse schemes [6]-[8]. When different power allocation
for the mobile users is associated with different portions of 
the frequency bandwidth, the frequency reuse is called a soft 
reuse scheme, and will be considered here in the context of 
automated healing. 

The paper is organized as follows: Section II introduces the 
concept and the system model for the SLAH and explains its 
different building blocks. Section III describes the LTE ICIC 
model that is used in the SLAH case study. The adaptation of 
the SLAH to heal the ICIC process is developed in Section 
IV. Numerical results are presented in Section V followed by 
concluding remarks in Section VI. 

II. System Model for Automated Healing 

It is assumed that the fault cause has been diagnosed 
as a specific RRM parameter (such as handover/mobility, 
admission and congestion control thresholds) whose value has 
degraded the performance of the eNodeB (eNB). An example 
of such a case is presented in [3] where the bad setting of the
add/drop window of a NodeB in a UMTS system is diagnosed. 
The purpose of the SLAH is to iteratively optimize the value 
of this RRM parameter using local information from the eNB 
and its neighbors. Hence, the automated healing is a local 
optimization process. The SLAH block diagram is presented 
in Figure 1. The system model comprises four blocks:

Fig. 1. SLAH block diagram.

Initialization block: The Initialization block provides the initial 
RRM parameter to the faulty eNB in the Network/Simulator 
block and to the Statistical Learning block. 
Network/Simulator block: The Network/Simulator block represents
the real network or the network simulator. It measures
(case of real network) or calculates (case of network simulator) 
a set of KPIs of an eNB and of its neighbors for each 
new RRM parameter introduced by the Initialization or the 
Optimization block. 

Statistical Learning block: The Statistical Learning block
extracts the functional relations, known as the statistical model,
between the KPIs and the RRM parameter through Logistic
Regression (LR) [9]. LR fits the data into the functional
form denoted as the logistic function:
$f_{\log}(z) = \frac{1}{1 + e^{-z}}$. The function $f_{\log}(z)$
can describe saturation effects at its extremities, as
often encountered in KPIs in communication networks.

Let $y_{m,i}$ denote the value of the $m$-th KPI corresponding to
the $i$-th sample value $x_i$ of the explanatory variable $x$
(i.e. the RRM parameter). LR models $y_{m,i}$ as follows:

$y_{m,i} = f_{\log}(\eta_{m,i}) + \epsilon_i$ (1)

where $\eta_{m,i} = \beta_{m,0} + \beta_{m,1} x_i$ is the linear
predictor representing the contribution of the explanatory
variable sample $x_i$, and $\epsilon_i$ is the residual error.
The $\beta$s are the regression coefficients whose values have to
be estimated using maximum likelihood estimation [10]. Hence
from (1), the functional relation between $\hat{y}_m$, i.e., $y_m$
estimated by LR, and $x$ can be written as:

$\hat{y}_m(x) = f_{\log}(\beta_{m,0} + \beta_{m,1} x)$ (2)
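
As an illustration of how a logistic relation between a KPI and the RRM parameter could be extracted from data, the following Python sketch fits the two coefficients of the linear predictor. It is a simplification under stated assumptions: the KPI is taken to be normalized to the interval (0, 1), the coefficients are obtained by ordinary least squares on the logit of the KPI rather than by maximum likelihood, and the names `fit_logistic`, `params` and `kpis` as well as the sample values are hypothetical.

```python
import math


def f_log(z):
    """Logistic function f_log(z) = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def logit(y, eps=1e-6):
    """Inverse of f_log; y is clipped to (eps, 1 - eps) to keep it finite."""
    y = min(max(y, eps), 1.0 - eps)
    return math.log(y / (1.0 - y))


def fit_logistic(xs, ys):
    """Return (beta0, beta1) such that f_log(beta0 + beta1 * x) tracks ys.

    Simplification: ordinary least squares on logit(y), not the maximum
    likelihood estimation referred to in the text.
    """
    zs = [logit(y) for y in ys]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_z = sum(zs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxz = sum((x - mean_x) * (z - mean_z) for x, z in zip(xs, zs))
    beta1 = sxz / sxx
    beta0 = mean_z - beta1 * mean_x
    return beta0, beta1


# Hypothetical, noise-free samples of a normalized KPI versus the RRM parameter.
params = [0.2, 0.4, 0.6, 0.8, 1.0]
kpis = [f_log(-4.0 + 2.0 * x) for x in params]
beta0, beta1 = fit_logistic(params, kpis)
```

On noise-free samples the fit recovers the generating coefficients; with noisy measurements the logit transform weights errors differently from maximum likelihood, so the two estimates will generally differ.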

Optimization block: The Optimization block calculates the 
optimal RRM value using the current statistical model. It 
determines $\hat{x}$, i.e., the value of the RRM parameter $x$ that
minimizes a cost function of a set of KPIs denoted as the
optimization set $A_o$, subject to constraints on a second set of
KPIs denoted as the constraint set $A_c$. Considering that
$\hat{y}_m(x)$ has the functional form given in (2), the
optimization problem can be formulated as:

$\hat{x} = \arg\min_x C(x)$ (3)

where $C(x) = \sum_{m \in A_o} w_m \hat{y}_m(x)$ is the cost function
and $w_m$ is the weight given to $\hat{y}_m(x)$.
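
A minimal Python sketch of such a constrained optimization is given below, assuming that each KPI has already been fitted by a logistic model and that the parameter is searched over a finite grid of candidate values. The grid search, the function name `optimize_parameter`, the strict "below threshold" form of the constraints and all numerical values are assumptions made for illustration; the text does not specify the solver used by the optimization engine.

```python
import math


def f_log(z):
    return 1.0 / (1.0 + math.exp(-z))


def optimize_parameter(candidates, cost_models, constraint_models):
    """Grid search for the x minimizing C(x) under the KPI constraints.

    cost_models:       (weight, beta0, beta1) per KPI in the optimization set
    constraint_models: (threshold, beta0, beta1) per KPI in the constraint set
    Returns (best_x, best_cost), or (None, inf) if no candidate is feasible.
    """
    best_x, best_cost = None, math.inf
    for x in candidates:
        if any(f_log(b0 + b1 * x) >= th for th, b0, b1 in constraint_models):
            continue  # a constrained KPI would reach its threshold
        cost = sum(w * f_log(b0 + b1 * x) for w, b0, b1 in cost_models)
        if cost < best_cost:
            best_x, best_cost = x, cost
    return best_x, best_cost


# Hypothetical fitted models: the cost KPI falls with x, the constrained KPI rises.
grid = [i / 20 for i in range(21)]
best_x, best_cost = optimize_parameter(
    grid,
    cost_models=[(1.0, 1.0, -3.0)],
    constraint_models=[(0.3, -4.0, 4.0)],
)
```

With these toy models the search returns x = 0.75, the largest grid value that keeps the constrained KPI below its threshold of 0.3.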

The automated healing process is iterative. At each 
iteration, a new RRM parameter value is proposed (by 
the Initialization block during the initialization iterations 
and by the Optimization block during the optimization 
iterations) to update the RRM setting of the faulty eNB in the 
Network/Simulator block. The performance of the faulty eNB 
and of its neighbors with this new RRM value is assessed 
by the Network/Simulator block through a set of KPI values 
obtained at the end of the measurement period (typically 
one day). Thus, a data point comprising an RRM parameter
value and the corresponding KPIs is obtained. This data point 
together with the previously obtained data points are used by 
the Statistical Learning block to refine the statistical model 
which is then used by the Optimization block to generate 
the RRM parameter value of the next iteration. Thus, as
the iterations progress, the model precision improves on
average and is used by the Optimization block to find a
better value for the RRM parameter.
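
The iterative process described above can be sketched in Python as follows. This is an illustrative toy, not the authors' implementation: `toy_network` is a hypothetical stand-in for the Network/Simulator block with bounded noise, the models are fitted by least squares on the logit of each KPI instead of maximum likelihood, and the optimization is a grid search with a single cost KPI and a single constrained KPI.

```python
import math
import random


def f_log(z):
    return 1.0 / (1.0 + math.exp(-z))


def logit(y, eps=1e-6):
    y = min(max(y, eps), 1.0 - eps)
    return math.log(y / (1.0 - y))


def fit(points):
    """Least-squares fit of logit(kpi) = b0 + b1 * x over (x, kpi) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_z = sum(logit(y) for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxz = sum((x - mean_x) * (logit(y) - mean_z) for x, y in points)
    b1 = sxz / sxx
    return mean_z - b1 * mean_x, b1


def toy_network(x, rng):
    """Hypothetical stand-in for the Network/Simulator block.

    Returns a noisy (cost KPI, constrained KPI) pair for parameter value x.
    """
    cost_kpi = f_log(1.0 - 3.0 * x + rng.uniform(-0.05, 0.05))
    constrained_kpi = f_log(-4.0 + 4.0 * x + rng.uniform(-0.05, 0.05))
    return cost_kpi, constrained_kpi


def heal(initial_values, threshold, n_iterations, rng):
    grid = [i / 100 for i in range(101)]
    # Initialization iterations: one data point per initial parameter value.
    history = [(x,) + toy_network(x, rng) for x in initial_values]
    for _ in range(n_iterations):
        # Statistical Learning block: refit both KPI models on all data so far.
        c0, c1 = fit([(x, cost) for x, cost, _ in history])
        g0, g1 = fit([(x, con) for x, _, con in history])
        # Optimization block: cheapest grid value predicted to be feasible.
        feasible = [x for x in grid if f_log(g0 + g1 * x) < threshold]
        x_next = min(feasible or grid, key=lambda x: f_log(c0 + c1 * x))
        # Network/Simulator block: measure KPIs for the proposed value.
        history.append((x_next,) + toy_network(x_next, rng))
    return history


history = heal([0.2, 0.5, 0.8], threshold=0.3, n_iterations=5, rng=random.Random(0))
proposed = [x for x, _, _ in history[3:]]
```

Each pass through the loop adds one data point, so the models are refitted on a growing data set, which mirrors the refinement of the statistical model from one iteration to the next.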

III. System Model for Interference Mitigation 

The performance of the proposed automated healing method 
is evaluated on an ICIC scheme which uses soft-frequency 
reuse. Consider a downlink ICIC scheme that combines two 
resource allocation mechanisms: Physical Resource Block 
(PRB) allocation to frequency subbands and coordinated 
power allocation. In the soft-reuse one scheme, the total avail- 
able bandwidth is reused in all the cells while the transmitted 
power for a portion of the bandwidth of a cell can be adapted 
to solve interference related QoS problems. Figure 2 presents
the power-frequency allocation model in a seven adjacent
cell layout. The frequency band is divided into three disjoint
subbands. One subband is allocated to mobiles with the worst
signal quality and is denoted interchangeably as a protected
band or as an edge band with transmit power $P$. A user
with poor radio conditions is often situated at the cell edge,
but could also be closer to the base station and experience
deep shadow fading. The remaining two frequency subbands
are denoted as centre bands with transmit power reduced by
a factor $\alpha$, namely $\alpha P$. The interference produced by
an eNB to its neighbours can be controlled by the parameter
$\alpha$ of this eNB. The main downlink interference in the system
originates from eNB transmissions on the centre band (to
centre cell users) which interfere with neighbouring cell edge
users utilizing their edge (protected) band. When an eNB
strongly interferes with the users of its neighbours, the ICIC
mechanism reduces the transmission power for the
centre band.

Fig. 2. System Model.

Resource block allocation is performed based on a priority
scheme for accessing the protected subbands. A quality metric
$q_u$ is calculated using pilot channel signal strengths as
$q_u = \frac{P_{s,u}}{\sum_{j \neq s} P_{j,u} + \sigma^2}$. Here $s$
stands for the serving eNB of user $u$, $P_{j,u}$ denotes the mean
pilot power received by the user $u$ of a signal transmitted by
the eNB $j$, and $\sigma^2$ is the noise power spectral density.
$q_u$ is similar to the SINR with the difference that in the
present ICIC scheme, the data channels used to calculate the
SINR are subject to power control. The $q_u$ metric is calculated
for all users, which are then sorted according to this metric.
Users with the worst $q_u$ are allocated resources from the
protected band and benefit from maximal transmission power of
the eNB. When the protected subband is full, the resource block
allocation continues from the centre band.
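
The priority scheme can be illustrated with the short Python sketch below. It assumes an SINR-like form for the quality metric (serving pilot power divided by the sum of the other pilot powers plus noise) and simplifies the allocation to one slot per user; the function names, the `edge_slots` parameter and the numerical values are hypothetical.

```python
def quality_metric(pilot_powers, serving, noise_power):
    """SINR-like metric from mean pilot powers (linear units).

    Assumed form: q_u = P_{s,u} / (sum over j != s of P_{j,u} + noise).
    """
    interference = sum(p for j, p in pilot_powers.items() if j != serving)
    return pilot_powers[serving] / (interference + noise_power)


def allocate_bands(quality, edge_slots):
    """Give the protected (edge) band to the users with the worst q_u.

    Simplification: each user takes one slot; remaining users go to the
    centre band once the protected band is full.
    """
    ranked = sorted(quality, key=quality.get)  # worst quality first
    return {
        user: ("edge" if rank < edge_slots else "centre")
        for rank, user in enumerate(ranked)
    }


# Hypothetical pilot powers (linear scale) for one user served by eNB "A".
q_u1 = quality_metric({"A": 4.0, "B": 1.0, "C": 0.5}, serving="A", noise_power=0.5)
bands = allocate_bands({"u1": q_u1, "u2": 9.0, "u3": 0.1}, edge_slots=2)
```

In this toy example the two users with the lowest metric are placed in the protected band and the third user is served from the centre band.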

Note that the soft-reuse ICIC scheme is characterized by
two other parameters in addition to $\alpha$: 1) the number of
PRBs assigned to the centre and edge bands; 2) the threshold
that determines the boundary between centre and cell edge
users. In this work, for reasons of simplicity, we deal with
only one parameter. The proposed algorithm can easily be
generalized to the case of multiple RRM parameters, at the
cost of increased complexity. The choice of the $\alpha$ parameter
is motivated by the simplicity of its implementation, which is
carried out by a simple power control on a pre-defined set of
subcarriers, while the other two parameters require
modifications of the PRB scheduling strategy. Nevertheless, the
proposed algorithm is equally applicable to the other two
parameters without any major alterations.

IV. SLAH for Interference Mitigation

This section describes the adaptation of the SLAH to
interference mitigation in an LTE network by locally optimizing
the parameter $\alpha$ of the interfering eNBs. Denote by $eNB_c$
($c$ standing for central) an eNB with degraded performance.
It is assumed that the cause of the degraded performance
has been diagnosed and is related to excessive inter-cell
interference which can be effectively mitigated by a
soft-reuse ICIC scheme. The first tier neighbours of $eNB_c$ are
denoted by $eNB_j$, $j \in NS_1$, where $NS_1$ is the index set
of the first-tier neighbours of $eNB_c$. The specificity of the
interference mitigation use case is the following: to heal
$eNB_c$, the parameters $\alpha_j$ of $eNB_j$, $j \in NS_1$, are
optimized, while $\alpha_c$ of $eNB_c$ remains unchanged.

We use the notion of coupling between $eNB_j$ and $eNB_c$,
which is expressed in terms of the interference that $eNB_j$
produces on the users connected to $eNB_c$ and can be written
in terms of the downlink interference matrix element $I_{cj}$ [4].
Hence the bigger the $I_{cj}$, the stronger the coupling between
the two eNBs. Note that the matrix element $I_{cj}$ is equal to the
time average of the sum of interferences perceived by the mobiles
attached to $eNB_c$ and generated by downlink transmissions
to the mobiles of $eNB_j$. Denote by $s$, $s \in NS_1$, the index
of the eNB which is the most coupled with $eNB_c$, namely
$s = \arg\max_j (I_{cj})$, $j \in NS_1$. To reduce the complexity of
the SLAH process, we propose to adjust the $\alpha_j$ parameter
according to the degree of coupling between $eNB_j$ and $eNB_c$.
Hence, we define a functional relation between $\alpha_s$ and
$\alpha_j$ that accounts for the coupling mentioned above:

Note that the smaller the coupling between $eNB_j$ and $eNB_c$,
the lesser the power reduction applied to $eNB_j$. Thus, by
using (4), only $\alpha_s$ needs to be optimized instead of all the
first tier $\alpha_j$s. The self-healing process can be performed
simultaneously on any number of eNBs provided they are not
direct neighbours.

The SLAH aims at minimizing the FTT of $eNB_c$ and of its
first-tier neighbours while satisfying constraints on their BCRs
($BCR_j$, $j \in c \cup NS_1$). We define the cost function for the
optimization as follows:

$C = FTT_c + \sum_{j \in NS_1} \omega_j FTT_j$ (5)

It is noted that $FTT_j$ is a function of $\alpha_j$ and hence, via
equation (4), of $\alpha_s$. The weighting coefficients $\omega_j$
depend on the relative contribution of $I_{cj}$ with respect to the
sum on all eNBs in $NS_1$ and are given by:

$\omega_j = \frac{I_{cj}}{\sum_{i \in NS_1} I_{ci}}$ (6)

satisfying the condition $\sum_{i \in NS_1} \omega_i = 1$. The
optimization problem can now be formulated as follows:

$\hat{\alpha}_s = \arg\min_{\alpha_s} C(\alpha_s)$ (7)

subject to $BCR_j < BCR_{th}$, $j \in c \cup NS_1$. $BCR_{th}$ is
the threshold for $BCR_j$ above which communication quality
is unacceptably poor. The FTT and BCR indicators in
equations (5) and (7) are given in the form of the LR function (2).

Initialization:
1. Identify the most coupled eNB, $eNB_s$, with $eNB_c$ among the
   neighbours in $NS_1$.
2. Generate the initial set of $k$ data points $P_k^j$,
   $j \in c \cup NS_1$, by applying $k$ different $\alpha_s$ values
   (together with the associated $\alpha_j$ values) to the
   network/simulator one by one and obtaining the corresponding KPIs.
Repeat until convergence:
3. For each $eNB_j$, compute the statistical model using LR for FTT
   and BCR using the corresponding data points in $P_k^j$.
4. Compute a new $\alpha$ vector containing the new values of
   $\alpha_j$, $j \in NS_1$ (using equations (4) and (7)).
5. Apply the new $\alpha_j$ values to the network/simulator and
   observe $FTT_j$ and $BCR_j$, $j \in c \cup NS_1$. Compute the new
   data point $p_{k+1}^j$.
6. Update $P_{k+1}^j$: $P_{k+1}^j = P_k^j \cup p_{k+1}^j$.
7. $k = k + 1$
End Repeat

TABLE I
The complete SLAH Algorithm
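
The coupling-based quantities used in this formulation can be computed as in the Python sketch below. It assumes that each weight is the share of the corresponding interference matrix element in the sum over the first-tier neighbours and that the cost is the FTT of the central eNB plus the weighted FTTs of its neighbours; the interference and FTT values are hypothetical and only the neighbour indices are taken from the case study.

```python
def most_coupled(interference):
    """Index s of the neighbour with the largest matrix element I_cj."""
    return max(interference, key=interference.get)


def coupling_weights(interference):
    """Weights proportional to I_cj, normalized to sum to one."""
    total = sum(interference.values())
    return {j: i_cj / total for j, i_cj in interference.items()}


def cost(ftt_c, ftt_neighbours, omega):
    """Assumed cost: FTT of the central eNB plus weighted neighbour FTTs."""
    return ftt_c + sum(omega[j] * ftt_neighbours[j] for j in omega)


# Hypothetical interference matrix elements I_cj for the first-tier neighbours.
i_c = {14: 1.0, 15: 2.0, 22: 0.5, 23: 1.5, 43: 1.0, 45: 4.0}
s = most_coupled(i_c)
omega = coupling_weights(i_c)
# Hypothetical FTT values (arbitrary units) for eNB_c and its neighbours.
c_value = cost(2.0, {j: 1.0 for j in i_c}, omega)
```

Because the weights sum to one, the neighbour term is a weighted average of the neighbour FTTs, so the most coupled neighbour dominates the cost without changing its scale.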

The SLAH can be further improved by introducing a generalized
interference matrix element $I'_{cj}$ in equation (4) by means of
an additional KPI, namely the BCR:

One can see that the higher the $B_j$ (i.e. the normalized
$BCR_j$), the smaller the $I'_{cj}$ and consequently, the smaller
the modification of $\alpha_j$. The significance of equation (8) is
that, in order to improve the performance of $eNB_c$, the decrease
made in $\alpha_j$ due to $I_{cj}$ is limited by the degradation in
$BCR_j$. Note that the constant $\gamma$ allows tuning the effect of
$B_j$.

Denote a data point $p_k^j$ as the vector
$p_k^j = (\alpha_j, FTT_j, BCR_j)_k$, where $j \in c \cup NS_1$ and
$k$ denotes the iteration index. Since the SLAH starts with an
initial data point and generates a new data point at each
iteration, the iteration index equals the total number of
generated data points. The set of $k$ data points for $eNB_j$,
$j \in c \cup NS_1$, is denoted by $P_k^j$. The SLAH algorithm is
given in Table I.



V. Case Study 

A. Simulation Scenario 

An LTE network comprising 45 eNBs in a dense urban environment
and having a bandwidth of 5 MHz is simulated using the
MATLAB simulator described in [11]. The simulator performs
correlated Monte Carlo snapshots with a time resolution of
1 second to account for the time evolution of the network.
FTP traffic with a file size of 6300 Kbits is considered.
Call arrivals are generated using a Poisson process and the
communication duration of each user depends on its bit rate.
The maximum number of PRBs in an eNB, i.e. the capacity, is
fixed to 24 PRBs with 8 PRBs in each sub-band. The number
of PRBs that can be allocated to a user can vary from 1 to 4,
allocated on a first-come first-served basis. No mobility is
taken into account.

For each new value of $\alpha$, the simulator runs for 2500 time
steps (seconds) to allow the convergence of the processed
KPIs. The BCR and FTT KPIs used by the SLAH algorithm
are averaged over the interval from 500 to 2500 seconds,
discarding the samples of the first 500 seconds during which
the network reaches a steady state. It is noted that for a
given traffic demand, the BCR provides a capacity indicator
while the FTT is more related to the user perceived QoS. The
simulated LTE system includes a simple admission control
process based on signal strength. A mobile
selects the eNB with the highest Reference Signal Received
Power (RSRP) and is admitted if the RSRP is above -104 dBm and
if at least one PRB is available. The mobile throughput is
calculated from the SINR using quality tables obtained from link
level simulations. The SINR and consequently the bit rate
of a mobile are updated after each simulation time step.
The interference matrix elements used in equations (4), (6)
and (8) are calculated only once for the reference solution
(see paragraph below) over a longer time interval, from
500 to 7000 seconds, to achieve accurate average results.
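
The admission rule and the KPI averaging described above can be sketched in Python as follows. The function names, the dictionary-based inputs and the sample values are hypothetical; the sketch only encodes the stated rule (highest RSRP, above -104 dBm, at least one free PRB) and the discarding of the first 500 one-second samples.

```python
RSRP_THRESHOLD_DBM = -104.0


def admit(rsrp_dbm, free_prbs):
    """Return the serving eNB index, or None if the call is blocked.

    rsrp_dbm:  eNB index -> RSRP measured by the mobile (dBm)
    free_prbs: eNB index -> number of PRBs currently available
    """
    best = max(rsrp_dbm, key=rsrp_dbm.get)  # eNB with the highest RSRP
    if rsrp_dbm[best] > RSRP_THRESHOLD_DBM and free_prbs.get(best, 0) >= 1:
        return best
    return None


def steady_state_mean(samples, warmup=500):
    """Average per-second KPI samples after discarding the warm-up period."""
    kept = samples[warmup:]
    return sum(kept) / len(kept)


# Hypothetical measurements for a mobile seeing eNB 13 and eNB 45.
served = admit({13: -95.0, 45: -100.0}, {13: 2, 45: 3})
blocked = admit({13: -95.0, 45: -100.0}, {13: 0, 45: 3})
# Hypothetical KPI trace: 500 s of transient followed by 2000 s of steady state.
mean_kpi = steady_state_mean([1.0] * 500 + [0.2] * 2000)
```

Note that in this sketch a mobile whose best eNB has no free PRB is blocked rather than redirected to the second-best eNB, which is one reading of the rule stated in the text.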

Reference Solution: An optimal default value for $\alpha$, known
as the reference solution, is calculated as 0.5 for all eNBs in
the network. The default $\alpha$ value is determined by varying
it simultaneously for all eNBs from 0.0125 to 1 in steps of
0.0125. For each $\alpha$, the network performance is assessed in
terms of the mean BCR and mean FTT. The minimum values
for both BCR and FTT are obtained in the $\alpha$ interval
[0.5, 0.7]. The value of $\alpha = 0.5$ is selected as the default
value due to the smaller inter-cellular interference and the
minimum energy consumption in the network.

B. Automated Healing Scenario 

A problematic eNB with the worst performance in the simulated
network (in terms of BCR and FTT), namely $eNB_{c=13}$,
is selected for automated healing using the SLAH algorithm.
The $eNB_j$, where $j \in NS_1 = \{14, 15, 22, 23, 43, 45\}$, is one
of the six first tier neighbours of $eNB_{c=13}$. $\alpha_c$ is fixed
to the reference default value of 0.5. The index set $NS_2$ of the
second tier neighbours of the problematic eNB is
$NS_2 = \{1, 10, 11, 16, 18, 24, 37, 44\}$. Denote by optimization
zone the subnetwork comprising $eNB_{c=13}$ and its first tier
neighbours $NS_1$, and by evaluation zone the subnetwork
comprising $eNB_{c=13}$ and its first two tier neighbours $NS_1$
and $NS_2$. The $eNB_{s=45}$ is the eNB most coupled with
$eNB_{c=13}$.

TABLE II
Phase-I shows the initially chosen $\alpha$ values. Phase-II shows
the $\alpha$ values calculated during optimization.



C. Results 

The SLAH algorithm is applied using the generalized
interference matrix (8) in (4), with $\gamma = -0.3$. The first five
values of $\alpha_{s=45}$ in Table II are chosen for the
initialization phase (Phase-I in the Table) of the SLAH. The next
seven values are calculated iteratively by the SLAH algorithm
during the optimization phase (Phase-II in the Table). The values
of $\alpha_{j=14}$, $\alpha_{j=15}$, $\alpha_{j=22}$, $\alpha_{j=23}$ and
$\alpha_{j=43}$ are calculated using equation (4). In spite of the
inherent noise present in the generated data, one can see from
the values depicted in Phase-II that $\alpha_{s=45}$ converges in a
few iterations. $\alpha_{s=45} = 0.46$ is chosen as the optimized
solution.

Figures 3(a) and 3(b) show the mean BCR and FTT data
points respectively as a function of $\alpha_{s=45}$ together with
the LR curves for $eNB_{c=13}$, $eNB_{j=22}$ and $eNB_{j=43}$ after
convergence (at the end of the 7th optimization iteration). The
KPI curves for $eNB_{j=14}$, $eNB_{j=15}$, $eNB_{j=23}$ and
$eNB_{s=45}$ are not shown as they have a similar trend. The
concentration of KPI data points around $\alpha_{s=45} = 0.45$
indicates the convergence of the SLAH algorithm.

Figure 4(a) shows the gain brought about by the SLAH
algorithm for the optimization zone (set $NS_1$ of eNBs). The
mean BCR of the problematic $eNB_{c=13}$ is reduced by 45%
with respect to the reference solution, from 5.28% to 2.9%.
The average improvement of the mean BCR in the first tier
($NS_1$) is 44% with respect to the reference solution.
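
For clarity, the 45% figure is a relative reduction of the mean BCR with respect to the reference value, which can be checked from the two quoted values:

```latex
\[
\frac{5.28\% - 2.9\%}{5.28\%} = \frac{2.38}{5.28} \approx 0.451
\]
```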
In the case of mean FTT, the improvement brought about by
the SLAH algorithm in the optimization zone with respect to the
reference solution is shown in Figure 4(b). The mean FTT of
$eNB_{c=13}$ is reduced by 6.31% and the average improvement
of the mean FTT in the first tier is 26.6%. This improvement
is related to the optimized interference management in the first
tier of the problematic eNB. The decrease in interference
improves the SINR values and consequently the bit rates and the
FTT values. Furthermore, the improvement in power resource
allocation decreases the sojourn time of users that monopolize
scarce radio resources and results in the improvement in BCR.

Figures 5(a) and 5(b) show, in descending order, the mean
BCR and the mean FTT respectively for the reference (square)
and the optimized (circle) eNBs in the evaluation zone
($eNB_{c=13} \cup NS_1 \cup NS_2$). It is noted that the order of the
stations in the two curves of each Figure may not be preserved.
One can see that, on average, the mean BCR and mean FTT
in the evaluation zone are improved. The average improvement
of FTT in the evaluation zone is 13%.

VI. Conclusion 

This paper has presented a new approach, the SLAH, for
automated healing of cells with poor performance. The SLAH is
an iterative optimization algorithm that uses statistical learning
in conjunction with a simple optimization module. During each
iteration, the RRM solution computed by the Optimization block
is improved jointly with the improvement in the statistical
model. The SLAH can be implemented in the management
plane, e.g. in the OMC in an off-line mode. It has been
successfully applied to heal a downlink ICIC parameter of an
eNB with degraded performance due to excess downlink
inter-cell interference in an LTE network. The proposed approach
has several attractive features: it is generic and can be easily
adapted to deal with different types of faulty parameters; it
performs well in the presence of noisy data; and it converges in
a small number of iterations (seven optimization iterations in
the case study). The SLAH method provides the basis for
designing self-healing algorithms.